Literature Search Made Easy

Authors

  • Bo Wang
  • Min Liu
Abstract

Many of us have faced this situation: we find a field of research intriguing, go to an online archive to search for related papers, and get results that are less than satisfactory. They are too unstructured to give a big picture, such as how the field evolves over time or what sub-fields it is composed of. Instead of relying on human instinct for such insights, our project provides an alternative: using machine learning algorithms to peruse online archives, generate the information of interest, and present it in a human-friendly way. In this project, we used hierarchical latent Dirichlet allocation (hLDA) to find the topics and their hierarchy, and the dynamic topic model (DTM) to find the time evolution of topics. In addition to using these two models, we made an important extension to them: constructing the corpus with key phrases instead of keywords. Furthermore, we devised an algorithm to remove the duplication in word and phrase counts, which enabled us to generate a corpus composed of both keywords and key phrases. This approach greatly improved the interpretability of the results, since a large number of concepts in science and technology are formulated as phrases. In our experiments with 14185 condensed matter physics papers from arXiv, these methods effectively found the trends and hierarchy of research topics and revealed some very interesting insights. In our report, the Trends in Research Topics section demonstrates the analysis of topic trends, the advantage of using key phrases over keywords, and a caveat in word and phrase counting when constructing the corpus. The Topic Hierarchy section presents the generation of a hierarchy of topics and the solution to the aforementioned caveat.
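The abstract does not spell out how duplication between word and phrase counts is removed. As a minimal sketch of one plausible approach, and not the authors' actual algorithm, the Python below matches known key phrases greedily (longest first) and counts a word on its own only when it is not part of a matched phrase. The function name `count_terms`, the `phrases` set, and the `max_len` limit are all hypothetical, introduced only for illustration.

```python
from collections import Counter


def count_terms(tokens, phrases, max_len=3):
    """Count key phrases and single words without double counting.

    tokens  : list of lowercased word tokens for one document
    phrases : set of tuples, each a known key phrase (assumed to be given)
    max_len : longest phrase length, in words, to try (assumed limit)
    """
    counts = Counter()
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first (greedy longest match).
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in phrases:
                counts[" ".join(candidate)] += 1
                i += n  # skip the words consumed by the phrase
                matched = True
                break
        if not matched:
            # The word is not part of any matched phrase: count it alone.
            counts[tokens[i]] += 1
            i += 1
    return counts


tokens = "quantum hall effect in a quantum dot".split()
phrases = {("quantum", "hall", "effect"), ("quantum", "dot")}
counts = count_terms(tokens, phrases)
print(counts)
```

In this toy input, "quantum" never appears as a standalone count because both occurrences are absorbed into a phrase, which is the kind of duplication the abstract refers to. A real pipeline would also need a way to discover the key phrases in the first place, which this sketch simply takes as given.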


Similar resources

Semantic Based Retrieval of Biomedical Literature

The published literature in bioinformatics and the biomedical field is rapidly increasing and the total citations maintained by PubMed surpass 16 million. While it is very encouraging to witness the rapid advancement of research in this important area, this enormous volume creates a problem for researchers to keep up with the current literature. Text-based search portals such as Google help to ...


Systematic review of EASY-care needs assessment for community-dwelling older people

BACKGROUND undertaking comprehensive geriatric assessments (CGAs) combined with long-term health and social care management can improve the quality of life of older people [ 1]. The EASY-Care tool is a CGA instrument designed for assessing the physical, mental and social functioning and unmet health and social needs of older people in community settings or primary care. It has also been used as...


Cone-beam computerized tomography (CBCT) imaging of the oral and maxillofacial region: a systematic review of the literature.

This study reviewed the literature on cone-beam computerized tomography (CBCT) imaging of the oral and maxillofacial (OMF) region. A PUBMED search (National Library of Medicine, NCBI; revised 1 December 2007) from 1998 to December 2007 was conducted. This search revealed 375 papers, which were screened in detail. 176 papers were clinically relevant and were analyzed in detail. CBCT is used in O...


TPX: Biomedical literature search made easy

UNLABELLED TPX is a web-based PubMed search enhancement tool that enables faster article searching using analysis and exploration features. These features include identification of relevant biomedical concepts from search results with linkouts to source databases, concept based article categorization, concept assisted search and filtering, query refinement. A distinguishing feature here is the ...


A New Method for Forecasting Uniaxial Compressive Strength of Weak Rocks

The uniaxial compressive strength of weak rocks (UCSWR) is among the essential parameters involved for the design of underground excavations, surface and underground mines, foundations in/on rock masses, and oil wells as an input factor of some analytical and empirical methods such as RMR and RMI. The direct standard approaches are difficult, expensive, and time-consuming, especially with highl...


Anesthetic Considerations in a Case of Fahr and Primrose Syndrome

Copyright 2018 by Gupta B. This is an open-access article distributed under Creative Commons Attribution 4.0 International License (CC BY 4.0), which allows to copy, redistribute, remix, transform, and reproduce in any medium or format, even commercially, provided the original work is properly cited. In this review article, brief description, concerns, anesthetic management and perioperative ca...



Publication date: 2013